132 research outputs found

    Direct-to-Patient Survey for Diagnosis of Benign Paroxysmal Positional Vertigo

    Given the high incidence of dizziness and its frequent misdiagnosis, we aim to create a clinical support system to classify the presence or absence of benign paroxysmal positional vertigo with high accuracy and specificity. This paper describes a three-phase study currently underway for classification of benign paroxysmal positional vertigo, which includes diagnosis by a specialist in a clinical setting. Patient background information is collected by a survey on an Android tablet and machine learning techniques are applied for classification. Decision trees and wrappers are employed for their ability to provide information about the question set. One goal of the study is to attain an optimal question set. Each phase of the study presents a unique set and style of questions. Results achieved in the first two phases of the survey indicate that our approach using decision trees with filters or wrappers does a good job of identifying benign paroxysmal positional vertigo

    Probabilistic Anomaly Detection in Natural Gas Time Series Data

    This paper introduces a probabilistic approach to anomaly detection, specifically in natural gas time series data. In the natural gas field, there are various types of anomalies, each of which is induced by a range of causes and sources. The causes of a set of anomalies are examined and categorized, and a Bayesian maximum likelihood classifier learns the temporal structures of known anomalies. Given previously unseen time series data, the system detects anomalies using a linear regression model with weather inputs, after which the anomalies are tested for false positives and classified using a Bayesian classifier. The method can also identify anomalies of an unknown origin. Thus, the likelihood of a data point being anomalous is given for anomalies of both known and unknown origins. This probabilistic anomaly detection method is tested on a reported natural gas consumption data set

    Data Improving in Time Series Using ARX and ANN Models

    Anomalous data can negatively impact energy forecasting by causing model parameters to be incorrectly estimated. This paper presents two approaches for the detection and imputation of anomalies in time series data. Autoregressive with exogenous inputs (ARX) and artificial neural network (ANN) models are used to extract the characteristics of time series. Anomalies are detected by performing hypothesis testing on the extrema of the residuals, and the anomalous data points are imputed using the ARX and ANN models. Because the anomalies affect the model coefficients, the data cleaning process is performed iteratively: the models are re-learned on “cleaner” data after an anomaly is imputed, and the anomalous data are re-imputed at each iteration using the updated ARX and ANN models. The ARX and ANN data cleaning models are evaluated on natural gas time series data. This paper demonstrates that the proposed approaches are able to identify and impute anomalous data points. Forecasting models learned on the unclean data and the cleaned data are tested on an uncleaned out-of-sample dataset. The forecasting model learned on the cleaned data outperforms the model learned on the unclean data, with a 1.67% improvement in the mean absolute percentage error and a 32.8% improvement in the root mean squared error. Existing challenges include correctly identifying specific types of anomalies such as negative flows
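The iterative detect, impute, and re-learn loop described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' method: a plain least-squares regression on a single exogenous input stands in for the ARX and ANN models, and a simple z-score rule on the largest residual stands in for the hypothesis test. The names `clean_series` and `z_threshold` are hypothetical.

```python
import numpy as np

def clean_series(y, x, z_threshold=3.0, max_iter=10):
    """Flag extreme residuals one at a time, refitting on the remaining points.

    Assumption: ordinary least squares on one exogenous input replaces the
    ARX/ANN models, and a z-score threshold replaces the hypothesis test.
    """
    y = np.asarray(y, dtype=float).copy()
    design = np.column_stack([np.ones(len(y)), np.asarray(x, dtype=float)])
    keep = np.ones(len(y), dtype=bool)  # True for points still trusted
    flagged = []
    for _ in range(max_iter):
        # Re-learn the model on the points not yet flagged as anomalous.
        coef, *_ = np.linalg.lstsq(design[keep], y[keep], rcond=None)
        residuals = y - design @ coef
        candidates = np.where(keep)[0]
        worst = int(candidates[np.argmax(np.abs(residuals[candidates]))])
        sigma = residuals[keep].std()
        if sigma == 0 or abs(residuals[worst]) < z_threshold * sigma:
            break  # the largest remaining residual is not extreme
        keep[worst] = False
        flagged.append(worst)
    # Impute every flagged point from the model learned on the cleaner data.
    coef, *_ = np.linalg.lstsq(design[keep], y[keep], rcond=None)
    y[~keep] = (design @ coef)[~keep]
    return y, flagged

# Synthetic example: a linear trend with small alternating noise and one spike.
x = np.arange(20)
y = 2.0 * x + 1.0 + 0.5 * (-1.0) ** x
y[7] += 100.0
cleaned, flagged = clean_series(y, x)
```

Excluding flagged points from each refit keeps an already imputed value from influencing the coefficients, which is one simple way to honor the abstract's point that anomalies distort the model parameters.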

    An Ensemble Model of QSAR Tools for Regulatory Risk Assessment

    Quantitative structure activity relationships (QSARs) are theoretical models that relate a quantitative measure of chemical structure to a physical property or a biological effect. QSAR predictions can be used for chemical risk assessment for protection of human and environmental health, which makes them interesting to regulators, especially in the absence of experimental data. For compatibility with regulatory use, QSAR models should be transparent, reproducible and optimized to minimize the number of false negatives. In silico QSAR tools are gaining wide acceptance as a faster alternative to otherwise time-consuming clinical and animal testing methods. However, different QSAR tools often make conflicting predictions for a given chemical and may also vary in their predictive performance across different chemical datasets. In a regulatory context, conflicting predictions raise interpretation, validation and adequacy concerns. To address these concerns, ensemble learning techniques in the machine learning paradigm can be used to integrate predictions from multiple tools. By leveraging various underlying QSAR algorithms and training datasets, the resulting consensus prediction should yield better overall predictive ability. We present a novel ensemble QSAR model using Bayesian classification. The model exposes a cut-off parameter that can be varied to select the desired trade-off between model sensitivity and specificity. The predictive performance of the ensemble model is compared with four in silico tools (Toxtree, Lazar, OECD Toolbox, and Danish QSAR) to predict carcinogenicity for a dataset of air toxins (332 chemicals) and a subset of the gold carcinogenic potency database (480 chemicals). Leave-one-out cross validation results show that the ensemble model achieves the best trade-off between sensitivity and specificity (accuracy: 83.8% and 80.4%, and balanced accuracy: 80.6% and 80.8%) and highest inter-rater agreement [kappa (κ): 0.63 and 0.62] for both the datasets. The ROC curves demonstrate the utility of the cut-off feature in the predictive ability of the ensemble model. This feature provides an additional control to the regulators in grading a chemical based on the severity of the toxic endpoint under study

    A New Temporal Pattern Identification Method For Characterization And Prediction Of Complex Time Series Events

    A new method for analyzing time series data is introduced in this paper. Inspired by data mining, the new method employs time-delayed embedding and identifies temporal patterns in the resulting phase spaces. An optimization method is applied to search the phase spaces for optimal heterogeneous temporal pattern clusters that reveal hidden temporal patterns, which are characteristic and predictive of time series events. The fundamental concepts and framework of the method are explained in detail. The method is then applied to the characterization and prediction, with a high degree of accuracy, of the release of metal droplets from a welder. The results of the method are compared to those from a Time Delay Neural Network and the C4.5 decision tree algorithm
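The time-delayed embedding step mentioned in this abstract can be illustrated with a minimal sketch. It covers only the construction of phase-space points from a scalar series, not the paper's pattern-cluster optimization. The function name `delay_embed` and the forward-looking row layout are assumptions made for illustration.

```python
import numpy as np

def delay_embed(series, dim, lag):
    """Return the time-delay embedding of a 1-D series.

    Row t is (x[t], x[t + lag], ..., x[t + (dim - 1) * lag]).
    """
    x = np.asarray(series, dtype=float)
    n_points = len(x) - (dim - 1) * lag
    if n_points <= 0:
        raise ValueError("series too short for this dim and lag")
    # Column i holds the series shifted forward by i * lag samples.
    return np.column_stack([x[i * lag : i * lag + n_points] for i in range(dim)])

embedded = delay_embed([1, 2, 3, 4, 5, 6], dim=3, lag=2)
# embedded has two rows: [1, 3, 5] and [2, 4, 6]
```

Each row is one point in the reconstructed phase space; a temporal pattern is then a region of that space whose points tend to precede the event of interest.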

    Analyzing Logistic Map Pseudorandom Number Generators for Periodicity Induced by Finite Precision Floating-Point Representation

    Because of the mixing and aperiodic properties of chaotic maps, such maps have been used as the basis for pseudorandom number generators (PRNGs). However, when implemented on a finite precision computer, chaotic maps have finite and periodic orbits. This manuscript explores the consequences finite precision has on the periodicity of a PRNG based on the logistic map. A comparison is made with conventional methods of generating pseudorandom numbers. The approach used to determine the number, delay, and period of the orbits of the logistic map at varying degrees of precision (3 to 23 bits) is described in detail, including the use of the Condor high-throughput computing environment to parallelize independent tasks of analyzing a large initial seed space. Results demonstrate that in terms of pathological seeds and effective bit length, a PRNG based on the logistic map performs exponentially worse than conventional PRNGs
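Why finite precision forces periodic orbits can be seen in a small sketch. It assumes a fixed-point grid of 2**bits + 1 values on [0, 1] with round-to-nearest, which is a simplification chosen here and not necessarily the floating-point representation studied in the paper. The function name `logistic_orbit` is hypothetical.

```python
def logistic_orbit(seed, bits):
    """Iterate x -> 4x(1 - x) on a fixed-point grid and return (delay, period).

    The state k is an integer in [0, 2**bits] representing x = k / 2**bits.
    Because the state space is finite, every orbit eventually repeats.
    """
    n = 1 << bits
    seen = {}  # state -> step at which it was first visited
    k, step = seed, 0
    while k not in seen:
        seen[k] = step
        k = (4 * k * (n - k) + n // 2) // n  # round to the nearest grid point
        step += 1
    delay = seen[k]  # number of transient steps before the cycle starts
    return delay, step - delay

# Survey every seed at 8 bits and collect the distinct cycle lengths.
periods = sorted({logistic_orbit(seed, 8)[1] for seed in range((1 << 8) + 1)})
```

Running the survey at increasing `bits` gives a rough feel for how the number and length of cycles change with precision, which is the kind of seed-space scan the abstract describes parallelizing.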

    Generalized Phase Space Projection for Nonlinear Noise Reduction

    Improved phase space projection methods, adapted from related work in the linear signal processing field based on subspace decomposition, are presented for application to the problem of additive noise reduction in the context of phase space analysis. These methods improve upon existing methods such as Broomhead–King singular spectrum analysis projection by minimizing overall signal distortion subject to constraints on the residual error, rather than using a direct least-squares fit. This results in a range of weighted projections which estimate and compensate for the portion of the principal component's singular values corresponding to noise rather than signal energy, and which include least-squares (LS) and linear minimum mean square error (LMMSE) as subcases. The nature of phase space covariance, the key element in construction of the projection matrix, is examined across global phase spaces as well as within local neighborhood regions. The resulting algorithm, illustrated on a noisy Henon map as well as on the task of speech enhancement, is applicable to a wide variety of nonlinear noise reduction tasks

    Inter-Turn Fault Diagnosis in Induction Motors Using the Pendulous Oscillation Phenomenon

    A robust interturn fault diagnostic approach based on the concept of magnetic field pendulous oscillation, which occurs in induction motors under faulty conditions, is introduced in this paper. This approach enables one to distinguish and classify an unbalanced voltage power supply and machine manufacturing/construction imperfections from an interturn fault. The experimental results for the two case studies of a set of 5-hp and 2-hp induction motors verify the validity of the proposed approach. Moreover, it can be concluded from the experimental results that if the circulating current level in the shorted loop increases beyond the phase current level, an interturn fault can be easily detected using the proposed approach even in the presence of motor manufacturing imperfection effects

    Statistical Models of Reconstructed Phase Spaces for Signal Classification

    This paper introduces a novel approach to the analysis and classification of time series signals using statistical models of reconstructed phase spaces. With sufficient dimension, such reconstructed phase spaces are, with probability one, guaranteed to be topologically equivalent to the state dynamics of the generating system, and, therefore, may contain information that is absent in analysis and classification methods rooted in linear assumptions. Parametric and nonparametric distributions are introduced as statistical representations over the multidimensional reconstructed phase space, with classification accomplished through methods such as Bayes maximum likelihood and artificial neural networks (ANNs). The technique is demonstrated on heart arrhythmia classification and speech recognition. This new approach is shown to be a viable and effective alternative to traditional signal classification approaches, particularly for signals with strong nonlinear characteristics

    Time-Domain Isolated Phoneme Classification Using Reconstructed Phase Spaces

    This paper introduces a novel time-domain approach to modeling and classifying speech phoneme waveforms. The approach is based on statistical models of reconstructed phase spaces, which offer significant theoretical benefits as representations that are known to be topologically equivalent to the state dynamics of the underlying production system. The lag and dimension parameters of the reconstruction process for speech are examined in detail, comparing common estimation heuristics for these parameters with corresponding maximum likelihood recognition accuracy over the TIMIT data set. Overall accuracies are compared with a Mel-frequency cepstral baseline system across five different phonetic classes within TIMIT, and a composite classifier using both cepstral and phase space features is developed. Results indicate that although the accuracy of the phase space approach by itself is still currently below that of baseline cepstral methods, a combined approach is capable of increasing speaker independent phoneme accuracy